Abstract. Genome-Wide Association Studies (GWAS) offer an exciting
and promising new research avenue for finding genes for complex
diseases. Traditional case-control and cohort studies offer many advan- 
tages for such designs. Family-based association designs have long been 
attractive for their robustness properties, but robustness can mean a 
loss of power. In this paper we discuss some of the special features of 
family designs and their relevance in the era of GWAS. 
1. INTRODUCTION

The potential of genome-wide association studies 
(GWAS) to enable an unbiased search for disease 
loci across the entire human genome provides us 
with an unprecedented research opportunity in ge- 
netics. Interrogating several hundred thousand SNPs 
across many subjects at the same time raises many 
statistical challenges in the design and analysis of 
these studies. Genotyping on such a scale requires 
new methodology for handling data quality issues; 
likewise, association tests are computed for hundreds 
of thousands of markers, whose results have to be 
adjusted for multiple comparisons. The magnitude 
of these problems raises the question of whether the 
new technical ability to genotype such dense SNP 
sets will translate into the identification of novel ge- 
netic disease loci or whether the technical advance 
remains under-utilized. 

A popular way to address the multiple testing in 
genome-wide association studies has been to design 
studies with a sample size of several thousand subjects
that are large enough that realistic effect sizes
can be detected, assuming that the test results will 
be corrected for multiple testing using the Bonfer- 
roni approach. However, such large studies come at a 
price. By putting together samples of several thou- 
sand subjects, phenotypic and genetic heterogene- 
ity will be encountered in the sample. Further, since 
the need for large sample sizes also influences the 
study-design choice, the most commonly used design 
choice is a case-control sample of unrelated individ- 
uals with minimal or no covariates. Another popular 
approach is a population-based design of unrelated 
individuals without ascertainment condition related 
to the outcome of interest (e.g., studying obesity in 
a general population sample). In any event, the as- 
certainment of subjects and collection of their phe- 
notypic data is rarely carried out specifically for the 
GWAS; rather, the expense of the genotyping has 
led investigators to rely on samples previously col- 
lected and phenotyped for other studies, in some 
cases, large family samples that have been previ- 
ously collected for other genetic studies. Although 
the cost of genotyping is dropping rapidly, it still
tends to drive study design and makes power
considerations crucial in the design.
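
The effect of a Bonferroni correction on study size can be illustrated with a rough calculation. The sketch below is only an approximation under stated assumptions: a two-sided 1-df Z-test, 80% power, and 500,000 tested SNPs (an illustrative figure; the text says only "several hundred thousand"). The function names are hypothetical.

```python
from statistics import NormalDist

def bonferroni_alpha(family_alpha: float, n_tests: int) -> float:
    """Per-marker significance level under a Bonferroni correction."""
    return family_alpha / n_tests

def relative_sample_size(alpha_new: float, alpha_ref: float = 0.05,
                         power: float = 0.8) -> float:
    """Ratio of sample sizes needed at alpha_new versus alpha_ref.

    Uses the normal approximation n ~ ((z_{1-alpha/2} + z_{power}) / delta)^2,
    so the (unknown) standardized effect size delta cancels in the ratio.
    """
    z = NormalDist().inv_cdf
    numerator = (z(1 - alpha_new / 2) + z(power)) ** 2
    denominator = (z(1 - alpha_ref / 2) + z(power)) ** 2
    return numerator / denominator

# Assumed illustration: 500,000 SNPs at a family-wise level of 0.05.
per_marker = bonferroni_alpha(0.05, 500_000)
inflation = relative_sample_size(per_marker)
print(f"per-marker alpha = {per_marker:.1e}, sample size ratio = {inflation:.2f}")
```

Under these assumptions the required sample is several times larger than for a single test at the 0.05 level, which is the pressure toward large samples described above.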

An alternative approach to population-based or 
case-control studies of unrelated individuals is family- 
based studies. Family-based studies were used in 
association studies originally to provide protection 
against spurious association arising from population
substructure. Family designs offer some unique ad- 
vantages at the design and analysis phase of a GWAS. 

Their complete robustness against heterogeneity 
at a phenotypic and genetic level allows the joint 
analysis of arbitrarily large and diverse samples with 
family designs, an advantage in the GWAS setting. 
As we will discuss in Section 3, they have both draw- 
backs and benefits over conventional designs when 
genotyping errors are present. We will also discuss 
two-stage test strategies for family designs that main- 
tain the original robustness of the approach, while 
achieving power-levels that are similar to those of 
population-based studies. 

Our objective in this paper is to first describe 
some of the special features of family-based designs 
that make them attractive for association studies, 
and then to focus particularly on their use in GWAS with
regard to genotyping errors and potential for ad- 
dressing the multiple comparison problem. 

2. OVERVIEW OF FAMILY DESIGNS FOR A 
SINGLE MARKER 

It has long been recognized that various sorts of 
population substructure can distort tests of associ- 
ation because different populations may have differ- 
ent disease rates, and/or genotype frequencies 
(Devlin and Roeder (1999); Pritchard, Stephens and 
Donnelly (2000); Whittemore (2006)). Family de- 
signs for genetic association studies were originally 
suggested (Falk and Rubinstein (1987); Ott (1989); 
Spielman, McGinnis and Ewens (1993)) as a way of 
avoiding spurious association due to population
substructure. The classic paper by Spielman, McGinnis
and Ewens (1993) on the Transmission Disequilibrium
Test (TDT) has contributed much to their
general popularity. There are many variations on 
the family design, but the simplest and generally 
most powerful design consists of selecting affected 
offspring and their parents, and genotyping the trio. 
Fig. 1. Trio design. Mother AA, father AB; offspring genotype probabilities P(AA) = 1/2, P(AB) = 1/2.



Essentially, having the genotypes of the parents en- 
ables one to take advantage of "Mendelian Random- 
ization" to avoid the need for an explicit control 
group. Under the null hypothesis of no association 
between the disease and the marker, each parent 
transmits one of their two alleles to each offspring, at 
random with probability 50/50 and independently 
of the other parent and of any other offspring. For 
the example in Figure 1, the mother can only trans- 
mit the A allele, but the father can transmit either 
A or B with probability 50/50. This holds when- 
ever there is no selection of the offspring related 
to the marker in question. Thus, when the parents'
genotypes are known, one can easily calculate the
distribution of the offspring genotypes under $H_0$.
This distribution is used to construct tests of the
null hypothesis. The observed and expected counts
can be used to construct an asymptotic $\chi^2$ test (Ott
(1989); Spielman, McGinnis and Ewens (1993)) or
exact tests can be used (Lazzeroni and Lange (1993)).
Because parents transmit independently to different 
offspring, multiple affected siblings can be used, re- 
sulting in a potential savings in genotyping costs. 
With more common diseases, using transmissions to 
unaffected siblings may also be beneficial 
(Lange and Laird (2002)). 
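
The null distribution of the offspring genotype given the parental genotypes can be enumerated directly from Mendel's laws. The following minimal sketch assumes a biallelic marker with genotypes written as two-character strings such as "AA" or "AB"; the function name is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def offspring_distribution(mother: str, father: str) -> dict:
    """P(offspring genotype | parental genotypes) under Mendel's laws.

    Each parent transmits one of its two alleles with probability 1/2,
    independently of the other parent.
    """
    probs = Counter()
    for m_allele, f_allele in product(mother, father):
        genotype = "".join(sorted(m_allele + f_allele))  # "BA" -> "AB"
        probs[genotype] += Fraction(1, 4)
    return dict(probs)

# The trio of Figure 1: the mother can only transmit A, the father A or B.
print(offspring_distribution("AA", "AB"))
# Two heterozygous parents give the familiar 1/4, 1/2, 1/4 split.
print(offspring_distribution("AB", "AB"))
```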

A Class of Score Tests for Family Designs 

A more precise statistical argument regarding the 
robustness of the family designs can be made by 
considering the basis for the TDT test. The sim- 
ple TDT test is a score test, based on the likeli- 
hood of the offspring genotypes, conditioned on the 
offspring trait and the parental genotypes (Schaid 
(1996)). To develop this likelihood in a general set- 
ting, let P denote the parental genotypes of a trio, 
Y denote the trait of the offspring (here the trait 
can be arbitrary), and let X denote some numeri- 
cal coding for the offspring genotype, for example, 
number of A alleles or a dummy variable coding for 
a recessive or dominant genetic model. Further, let 
$f(Y \mid X, P, \theta)$ denote the probability density of the
offspring trait, conditioned on the offspring genotype,
the parental genotype and a vector of unknown
parameters, $\theta$. In genetic terminology, $f(Y \mid X, P, \theta)$
is the penetrance function and specifies the genetic
disease model. Generally, $f(Y \mid X, P, \theta)$ is assumed
not to depend directly on the parental genotypes
when offspring genotypes are in the model, but we
leave them in for generality. The vector $\theta$ will contain
both association parameters, say, $\beta$, and nuisance
parameters, say, $\alpha$, which will describe other
aspects of the trait distribution. In particular, we
parameterize so that $f(Y \mid X, P, \theta) = f(Y \mid X\beta, P, \alpha)$,
and under the null, $\beta = 0$, so that
$f(Y \mid X, P, \beta = 0, \alpha) = f(Y \mid P, \alpha)$, that is, the distribution of the
trait does not depend on the marker genotypes of
the offspring under the null. Further, let $f(X \mid P)$
be the probability density of the offspring genotype
conditioned on parental genotype. Note that the latter
is completely known and determined by Mendel's
laws, whereas the former reflects our alternative
hypothesis, and is generally unknown.

The conditional likelihood for the offspring geno- 
type (X) given parental genotypes (P) and the off- 
spring trait (Y) is given by 

(1) $f(X \mid Y, P, \theta) = \dfrac{f(Y \mid X, P, \theta)\, f(X \mid P)}{\sum_X f(Y \mid X, P, \theta)\, f(X \mid P)}$,

where summation is over all X compatible with P.
An important feature of conditioning on P is that 
any nuisance parameters in the distribution of the 
parental genotypes, such as allele frequencies and 
random mating assumptions, are not needed. As 
noted above, the penetrance function does not de- 
pend on X under the null, and hence cancels out of 
the likelihood. Thus, the distribution of X under the 
null is given simply by $f(X \mid P)$, which is completely
determined by Mendel's laws; no assumptions need 
be made about the distribution of parental geno- 
types or about the phenotypes. Thus, a score test 
will have the correctly specified null distribution as 
long as Mendel's laws hold, and will be completely 
robust to not only population substructure, but to 
potential misspecification of the trait distribution as 
well. 
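
The cancellation under the null can be written out explicitly. The display below is a sketch using the notation of this section, with $X'$ introduced here only as a summation index; it sets $\beta = 0$ in the conditional likelihood and uses the fact that $f(X' \mid P)$ sums to one over the genotypes compatible with P.

```latex
\begin{align*}
f(X \mid Y, P, \beta = 0, \alpha)
  &= \frac{f(Y \mid P, \alpha)\, f(X \mid P)}
          {\sum_{X'} f(Y \mid P, \alpha)\, f(X' \mid P)} \\
  &= \frac{f(Y \mid P, \alpha)\, f(X \mid P)}
          {f(Y \mid P, \alpha) \sum_{X'} f(X' \mid P)}
   = f(X \mid P).
\end{align*}
```

Neither the nuisance parameters $\alpha$ nor the form of the trait distribution survive in the final expression, which is why the null distribution is the same whatever trait model is specified.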

In the TDT, we condition on Y = 1 and let X 
denote the number of a particular allele that an individual
has. The model $p(Y = 1 \mid X, P, \theta)$ can take
any form, logistic, log-linear, linear, etc., with $\alpha$
modeling the probability for X = 0. A simple form
for the penetrance function, which provides a gen- 
eralization of the TDT for any phenotype, can be 
obtained by assuming an exponential family model 
for the trait distribution with a generalized linear 
model for the mean response (Lunetta et al. (2000); 
Liu et al. (2002); Dudbridge (2008)). In this case,
the score takes the special form of a type of covariance
between the trait and the marker:

(2) $U = \sum [Y - E(Y)][X - E(X \mid P)]$,



where summation is over all trios. Here E(Y) is
the mean trait under $H_0$ and may depend upon
the unknown nuisance parameters $\alpha$, and $E(X \mid P)$
is computed using only Mendel's laws. An asymptotic
Z (or $\chi^2$) test statistic is formed by normalizing
(2) by the square root of $\sum [Y - E(Y)]^2\, \mathrm{var}(X \mid P)$,
where $\mathrm{var}(X \mid P)$ can also be computed simply from
Mendel's laws. Alternatively, exact tests using
Mendel's laws to compute $f(X \mid P)$ can be easily
calculated (Lazzeroni and Lange (1993) and
Schneiter, Laird and Corcoran (2005)).
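
A minimal sketch of the normalized statistic for trios follows. It assumes X is coded as the number of A alleles and that each parent is summarized by its own count of A alleles (0, 1 or 2); the trait offset E(Y) is supplied by the user. The function names and the data layout are hypothetical and are not taken from the FBAT or PBAT packages.

```python
import math

def mendel_moments(mother_a: int, father_a: int):
    """E(X|P) and var(X|P) for X = number of A alleles in the offspring.

    A parent carrying k copies of A transmits A with probability k/2, and the
    two parental transmissions are independent under Mendel's laws.
    """
    pm, pf = mother_a / 2, father_a / 2
    mean = pm + pf
    var = pm * (1 - pm) + pf * (1 - pf)
    return mean, var

def fbat_z(trios, trait_mean: float) -> float:
    """Z statistic: U from (2) divided by the square root of its variance.

    trios: iterable of (y, x, mother_a, father_a).
    """
    u = 0.0
    v = 0.0
    for y, x, mother_a, father_a in trios:
        mean, var = mendel_moments(mother_a, father_a)
        t = y - trait_mean
        u += t * (x - mean)
        v += t * t * var
    if v == 0:
        raise ValueError("no informative trios: the variance is zero")
    return u / math.sqrt(v)

# Four affected offspring (y = 1) with the offset set to 0 for illustration.
example = [(1, 2, 1, 1), (1, 2, 1, 1), (1, 1, 1, 1), (1, 2, 1, 2)]
print(round(fbat_z(example, trait_mean=0.0), 3))
```

Trios with two homozygous parents have var(X|P) = 0 and X = E(X|P), so they drop out of both sums, which matches the usual notion of uninformative families.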

A potential barrier to constructing score tests in
this general case is in estimating the nuisance parameters
$\alpha$. Standard likelihood ratio methods cannot
be used here, because under the null, the likelihood
does not depend on $\theta$ and the $\alpha$ parameters
cannot be estimated. The case of trios, where all
offspring are affected (Y = 1), is special in this regard.
Here, $Y - E(Y)$ is constant for everyone, and
because we condition on Y, the score test can be
reformulated as

(3) $U = \sum [X - E(X \mid P)]$.

It is easily seen that this score test yields the TDT 
when X is coded to count the number of alleles of 
interest (Schaid (1996)). 
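
The equivalence with the TDT can be checked numerically. With X coded as the number of A alleles, each heterozygous parent contributes +1/2 or -1/2 to U in (3) and 1/4 to its variance, while homozygous parents contribute nothing. The sketch below assumes the usual TDT notation, where b and c are the numbers of heterozygous parents transmitting A and the other allele; the function names are hypothetical.

```python
def tdt_chi2(b: int, c: int) -> float:
    """Classical TDT statistic from transmission counts of heterozygous parents."""
    if b + c == 0:
        raise ValueError("no heterozygous parents")
    return (b - c) ** 2 / (b + c)

def score_chi2(b: int, c: int) -> float:
    """Squared score statistic U^2 / var(U) built from the same transmissions."""
    if b + c == 0:
        raise ValueError("no heterozygous parents")
    u = 0.5 * b - 0.5 * c   # sum of X - E(X|P) over transmissions
    v = 0.25 * (b + c)      # each heterozygous parent adds variance 1/4
    return u * u / v

# Illustrative counts: 60 transmissions of A versus 40 of the other allele.
print(tdt_chi2(60, 40), score_chi2(60, 40))
```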

If we include unaffected offspring (Y = 0) as well 
as affected, then equation (2) still holds, but the test 
now depends upon estimating the prevalence E(Y) 
because (Y - E(Y)) is not constant. If selection of
subjects depends upon disease status, then preva- 
lence cannot be estimated from the sample data, 
but often some a priori information is available. In 
the more general case of measured phenotypes, the 
test depends on the specified disease model via the 
nuisance parameters implicit in E(Y) and remains 
valid regardless of choice of disease model provided 
Mendel's laws hold. While model choice can affect 
power (Lange and Laird (2002); Lange, DeMeo and 
Laird (2002)), choice of the wrong disease model 
does not affect robustness, as the test is conditioned 
on the trait. When samples are selected on the ba- 
sis of the disease trait, as is generally the case with 
dichotomous traits, the nuisance parameters cannot 
be estimated from the data; methods for specifying 
E(Y) have been suggested (Lunetta et al. (2000); 
Lange and Laird (2002); Lu and Cantor (2007); 
Dudbridge (2008)). 
Missing Parental Information

Missing parental genotype information is a com- 
mon problem, especially for later onset diseases. 
There have been several approaches suggested for 
handling missing parents, including estimating a 
model for the parental genotypes distribution, and 
using joint likelihood ratio tests (Weinberg (1999)) 
or using score tests which average over the estimated 
distribution of the parental genotypes (Clayton 
(1999)) for families with missing parents. These ap- 
proaches are not guaranteed to retain robustness to 
population substructure, especially since both ap- 
proaches generally make simplifying assumptions con- 
cerning the distribution of the parental genotypes 
(e.g., common allele frequencies and Hardy-Weinberg
equilibrium); see Dudbridge (2008). Alternatively,
when siblings are sampled, $f(X \mid P)$ can be replaced
in the above equations by $f(X \mid S)$, where S denotes
the sufficient statistic for parental genotype
(Rabinowitz and Laird (2000)). Because S is a sufficient
statistic, $f(X \mid S)$ again does not depend upon
a model for the parents' genotype distribution, and the
score test remains fully robust. The distributions
are simple to enumerate, and tests based on (1)-(2)
with $f(X \mid P)$ replaced by $f(X \mid S)$ if parents are not
available are implemented in the FBAT
(www.biostat.harvard.edu/~fbat/) and PBAT
(www.biostat.harvard.edu/~clange) software packages. However,
the power of these tests can be much reduced,
depending upon the number of additional siblings 
available. We refer to the FBAT test to describe this 
general class of score tests which extends the TDT 
to other traits and other family designs. 

In summary, conditioning on both the parental 
genotypes and the offspring traits ensures robust- 
ness against misspecification of the disease model, 
and to the distribution of offspring genotypes under 
the null. The general approach has been extended 
to handle multiple siblings (Lange and Laird (2002) 
and Lange, DeMeo and Laird (2002)), missing par- 
ents (Rabinowitz and Laird (2000)), multiple traits 
(Lange et al. (2003)), haplotypes (Horvath et al. 
(2004)) and multiple markers (Xu et al. (2006); 
Rakovski et al. (2007)). 

Comparative Power Issues: Single Marker Case 

By and large, most approaches for analyzing 
GWAS, with conventional or family designs, begin
by testing each marker separately, and then do an 
adjustment for multiple comparisons to determine
genome-wide significance and/or select promising
SNPs or regions for further study based on rankings 
of some sort. There have been several proposals for 
alternative methods of testing to increase power in 
the face of multiple testing, as we will discuss in Section 4,
but, by and large, the genome-wide power of
a GWAS is usually estimated by calculating power 
for a single marker, using some appropriate alpha- 
level to adjust for multiple comparisons; thus, com- 
parative power issues for single markers translate 
directly to power calculations for genome-wide
studies.

We note that this one-marker, one-test approach 
is in strong contrast to genome-wide linkage scans,
where one can at least approximate the null distri- 
bution of the test statistic across the genome, for ex- 
ample, maximized lod-score, under the null hypoth- 
esis of no linkage (Feingold, Brown and Siegmund 
(1993)). With dense association scans, the unknown 
pattern of LD precludes specification of the joint 
distribution of the association test statistics under 
the null of no association. In principle, using permutation
tests in case-control studies can considerably
improve the probability of at least one positive
finding, but the magnitude of the computations
is prohibitive in a GWAS with hundreds of thousands
of SNPs. An exception to the one test per
typed SNP approach is the class of methods which incorporate
information from the HapMap to impute non-typed SNPs,
gaining additional power via testing a denser marker 
set (Marchini et al. (2007)). Thus far, this approach 
has been limited to case-control data and investiga- 
tion of methodology for family designs is desirable. 

Family-based tests, being conditional tests, are ro- 
bust and essentially model free, but the price of such 
robustness is some cost in terms of power. There are 
some cases, and some designs, however, where the 
power is essentially equivalent, as was shown for rare 
disease and the additive model in Laird and Lange 
(2006). Here we consider power comparisons for the 
recessive model with an $\alpha$-level of 0.00001 to more
nearly reflect a GWAS testing situation. Figures 2 
and 3 compare the power of four different designs: 
case-control, trios, discordant sib pairs (DSP) and 
discordant sib trios (DST; at least one discordant 
sib pair and one other sibling), for a rare disease and
a common one. The odds ratio is 1.75 in both cases, 
and the number of affected (1500) is the same for 
each design, although the number of genotypes required
can be different depending on design. The DSP design
is always very inefficient, whereas DST can do
well with more common disorders. For the recessive
model, the power of the case-control design and the
trio design are virtually identical for common diseases
(e.g., prevalence 14%), with minor advantages
for trio designs for low allele frequency and minor
advantages for the case-control design for common
alleles. However, for rare diseases, the trio design is
much more powerful than the case-control design. The
reason for the relative power loss of the case-control
design is that, for rare diseases, the differences between
the genotype distribution of healthy controls
and the genotype distribution of the general population
are minimal and the contribution of the controls
to the power of the test statistic diminishes. For
the trio design, we use only cases and, consequently,
such designs do not suffer this relative power loss for
small prevalences. The power results for the trios differ
slightly by prevalence because we base our model
on the odds ratio rather than the relative risk model.
We provide some simple algebraic calculations in the
Appendix to illustrate this point.

Fig. 2. Power for a common disease: 14%.

Fig. 3. Rare disease: 1%.
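
The argument about controls can be illustrated numerically. The sketch below is not the Appendix calculation; it is a small illustration under assumptions made here: a recessive model parameterized by the odds ratio, Hardy-Weinberg proportions in the population, and a risk-allele frequency of 0.3 chosen arbitrarily. It computes the frequency of the risk genotype in the population, in cases and in controls for the two prevalences used in Figures 2 and 3. All function names are hypothetical.

```python
def recessive_penetrances(prevalence: float, odds_ratio: float, q: float):
    """Solve for penetrances (f0, f2) of non-risk and risk genotypes.

    Constraints: q*f2 + (1-q)*f0 = prevalence and
    odds(f2) = odds_ratio * odds(f0), where q is the risk-genotype frequency.
    Bisection on f0 assumes odds_ratio >= 1, so that f0 <= prevalence.
    """
    def f2_from(f0: float) -> float:
        odds = odds_ratio * f0 / (1 - f0)
        return odds / (1 + odds)

    lo, hi = 0.0, prevalence
    for _ in range(200):
        mid = (lo + hi) / 2
        if q * f2_from(mid) + (1 - q) * mid < prevalence:
            lo = mid
        else:
            hi = mid
    f0 = (lo + hi) / 2
    return f0, f2_from(f0)

def risk_genotype_freqs(prevalence: float, odds_ratio: float, allele_freq: float):
    """Risk-genotype frequency in the population, in cases and in controls."""
    q = allele_freq ** 2  # Hardy-Weinberg proportions assumed
    f0, f2 = recessive_penetrances(prevalence, odds_ratio, q)
    cases = q * f2 / prevalence
    controls = q * (1 - f2) / (1 - prevalence)
    return q, cases, controls

for prevalence in (0.14, 0.01):
    q, cases, controls = risk_genotype_freqs(prevalence, 1.75, 0.3)
    print(f"prevalence {prevalence}: population {q:.4f}, "
          f"cases {cases:.4f}, controls {controls:.4f}")
```

Under these assumptions the control frequency moves closer to the population frequency as the disease becomes rarer, so the controls add little contrast beyond what the population frequency already provides.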

3. QUALITY CONTROL/DATA CLEANING IN 
FAMILY DESIGNS 

The large amount of genotyping required for a 
GWAS is accomplished via specially designed geno- 
typing platforms commonly called SNP-chips. Geno- 
typing errors include several types of failures that 
can occur in the genotyping process; these can re- 
sult in either missingness and/or misclassification 
of genotypes. The raw data of a single genotype 
for a single individual is a pair of measured inten- 
sities for each allele; the intensities are translated 
into genotypes, generally using some type of statis- 
tical clustering algorithm, referred to as the 'geno- 
type calling algorithm.' Perhaps due to poor DNA 
quality or design issues of the SNP-chip, the sample 
may simply fail to provide intensities or the intensi- 
ties do not separate into the three possible genotype 
clusters, making it impossible to obtain called geno- 
types. These errors all give rise to missing genotypes. 
Further missingness arises in the data cleaning pro- 
cess which is described below. Misclassification oc- 
curs if the calling algorithm makes a genotype call 
which is not correct; the probability of misclassifying
a genotype generally increases with lower minor
allele frequencies, and can depend upon the true,
unobserved, genotype.

N. M. LAIRD AND C. LANGE

In the data cleaning step of a GWAS, basic sta- 
tistical analysis tools are used as quality control fil- 
ters to identify SNPs and probands for which the 
SNP-chip is not able to provide sufficient genotyping 
quality (Manolio et al. (2007)). Such analysis tech- 
niques/filters include tests for departures from the 
Hardy-Weinberg Equilibrium, removal of SNPs with
low frequencies or low "call rates," or deletion of in- 
dividuals with low call rates. Since the inclusion of 
SNPs and probands with misclassified genotypes can 
lead to a substantial reduction in power, the data 
cleaning/filtering step is one of the most important 
parts in the analysis of a GWAS. While there has 
been much progress in improving genotype calling 
and data cleaning algorithms, we can expect that 
there will continue to be some level of missingness and
misclassification in all GWAS.
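
As an example of one such filter, the following sketch computes a 1-df Pearson chi-square test for departure from Hardy-Weinberg proportions at a single SNP. It is a generic textbook version, not the procedure of any particular QC pipeline; the function name is hypothetical, and exact tests are often preferred in practice when genotype counts are small.

```python
import math

def hwe_chi2(n_aa: int, n_ab: int, n_bb: int):
    """Pearson chi-square statistic and p-value for Hardy-Weinberg proportions."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes observed")
    p = (2 * n_aa + n_ab) / (2 * n)  # estimated frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

print(hwe_chi2(25, 50, 25))  # exactly in Hardy-Weinberg proportions
print(hwe_chi2(50, 0, 50))   # no heterozygotes at all
```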

When family data are used, an additional quality- 
control filter that is applied in the data cleaning step 
is the removal of Mendelian inconsistencies. Mendelian 
inconsistencies are genotype configurations in families
that violate Mendel's laws. For example, if a
"B"-allele is observed in a subject whose parents
do not carry any "B"-alleles, this is an obvious violation
of Mendel's laws. Such genotype configurations
are excluded from the analysis. Furthermore,
if Mendelian inconsistencies are more frequent for 
certain markers or families, this suggests that there 
are fundamental problems with the genotyping for 
these markers/families, and it is common practice
to exclude them from the analysis altogether. Mark- 
ers and/or families with more than five Mendelian 
inconsistencies are generally removed from the anal- 
ysis. 
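
The Mendelian consistency filter can be sketched as follows. The code assumes a biallelic marker, complete trios, and genotypes written as two-character strings; the function names and the dictionary layout are hypothetical. The default threshold of five follows the rule of thumb stated above.

```python
from itertools import product

def is_mendelian_consistent(child: str, mother: str, father: str) -> bool:
    """True if the child's genotype can arise from one allele of each parent."""
    target = "".join(sorted(child))
    return any("".join(sorted(m + f)) == target
               for m, f in product(mother, father))

def count_inconsistencies(trios) -> int:
    """trios: iterable of (child, mother, father) genotype strings."""
    return sum(not is_mendelian_consistent(c, m, f) for c, m, f in trios)

def markers_to_drop(trios_by_marker: dict, max_inconsistencies: int = 5) -> list:
    """Markers whose number of inconsistent trios exceeds the threshold."""
    return [marker for marker, trios in trios_by_marker.items()
            if count_inconsistencies(trios) > max_inconsistencies]

# A "B" allele in a child whose parents carry no "B" allele is inconsistent.
print(is_mendelian_consistent("AB", "AA", "AA"))  # False
print(is_mendelian_consistent("AB", "AA", "AB"))  # True
```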

For population-based designs, the presence of genotyping
errors resulting in misclassification and/or
missing genotypes does not cause bias under the
null provided errors/missingness is non-differential 
in cases and controls. By non-differential, we mean 
errors occur irrespective of case or control status. 
Genotyping cases and controls separately can lead 
to differential genotyping errors, and considerable 
bias in association tests. With non-differential geno- 
typing errors, there is no bias under the null and the 
effect of the genotyping error is to simply decrease 
the overall power of the GWAS for the population 
design. 

For family-based designs, the effects of genotyping
errors are different. It is a well described phenomenon
in the literature (Gordon et al. (2001,
2002); Douglas, Skol and Boehnke (2002); Sobel,
Papp and Lange (2002); Kang, Gordon and Finch
(2004)) that genotyping errors can cause biased tests
with inflated significance levels. With families, it
will be possible to identify some of the misclassi- 
fied genotypes by verifying that the offspring's geno- 
type is not plausible based on the parental geno- 
types (or in some cases, sibling genotypes). However, 
by removing families with transmission inconsisten- 
cies from the association analysis, only a fraction 
of the genotyping error is eliminated from the anal- 
ysis. In the computation of the test statistic, this 
causes a seeming over-transmission of the major al- 
lele, which leads to the anti-conservativeness of the 
family-based association test. 

Thus, while population-based studies have reduced
power in the presence of genotyping errors, family-based
studies will, in addition, have actual type-1
error rates inflated above the pre-specified significance
level. To judge the relative importance of this
fundamental difference between the two study-design
types, it is important to
consider the main purpose of GWA studies. Their 
goal is the discovery of new genetic disease loci and 
their confirmation/replication in independent stud- 
ies/samples. This is typically achieved by selecting 
the markers with the smallest p-values from the GWA
and trying to confirm/replicate them in independent 
studies. It is obvious that the presence of genotyp- 
ing errors will reduce the overall power of both de- 
sign types, either because of reduced power (case- 
control) or by both reduced power and inflated type- 
1 error (family designs). However, it is unclear for 
which design type these effects are more deleterious 
and careful simulation studies are much needed to 
address this issue. 

In practice, it will be important to estimate the 
undetected genotyping error rate in the data in or- 
der to assess the reduction in overall-power of the 
GWA study that is attributable to this error source. 
Otherwise, if a GWA is unable to identify new loci, 
it is unclear whether this is due to the actual absence
of genetic risk loci or due to the reduction in overall 
power caused by poor genotyping quality. Family- 
based studies offer a unique possibility to estimate 
the undetected genotyping error rate. By looking at 
the transmission pattern of the common allele for 
all genotyped markers in a GWA study, an overall/ 
genome-wide FBAT statistic can be computed and 
the undetected genotyping error rate in the study 
can be estimated through simulations under various 
error models (Fardo, Ionita and Lange (2008)). 
4. TESTING STRATEGIES FOR THE
MULTIPLE COMPARISON PROBLEM IN 
GENOME-WIDE ASSOCIATION STUDIES 

With mapping arrays for more than one million 
SNPs now available (Matsuzaki et al. (2004); 
Di et al. (2005); Gunderson et al. (2006); Wadma 
(2006)), genome-wide association studies carry the 
promise to identify replicable associations between 
important genetic risk factors and most complex dis- 
eases. One of the major hurdles that needs to be 
addressed in order to make genome-wide associa- 
tion studies successful is the multiple comparison 
problem. Hundreds of thousands of SNPs are geno- 
typed and examined for potential associations with 
multiple phenotypes, possibly using different model 
assumptions, resulting in potentially millions of sta- 
tistical tests. 

Initial efforts to resolve this problem with case- 
control designs were directed toward multi-stage de- 
signs involving multiple independent samples. At 
stage 1, all SNPs are tested in a relatively small sam- 
ple and the most significant ones retained for test- 
ing with a larger, independent sample; the winnow- 
ing process can be repeated multiple times. How- 
ever, Skol et al. (2006) showed that such designs are 
inherently less powerful than designs which use all 
samples for the final analysis of selected SNPs, even 
though Bonferroni adjustment must be made for 
testing all SNPs. Thus, the desired strategy now for 
population-based designs is to select a large enough
sample (3000-5000 cases and an equal number of controls)
to achieve sufficient power for all SNPs simultaneously,
but also utilize independent "replication"
samples which are different from the original sample 
in some distinct way, for example, non-overlapping 
populations. 

Other strategies to ameliorate the multiple com- 
parisons problem utilize some "outside" informa- 
tion, for example, information from linkage stud- 
ies, functional SNPs, etc. Such approaches include 
Bayesian approaches which use prior distributions 
to specify effects for markers (Wakefield (2008)), 
weighted Bonferroni methods which assign different 
significance levels to each SNP according to their 
"importance or relevance" (Roeder, Devlin and 
Wasserman (2007); Eskin (2008)) and split-sample 
approaches (Wasserman and Roeder (2006); 
Song et al. (2007)). For family-based association 
tests, the idea of using "outside information" naturally
translates to the use of the information about
the association at a population-based level that is
not utilized in the family-based association test. 
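
The weighted Bonferroni idea mentioned above can be sketched in a few lines. The version below is a generic one, not the specific procedure of any of the cited papers: weights are rescaled to average one, and SNP i is declared significant when its p-value is at most alpha times its weight divided by the number of tests, so the per-SNP levels still sum to alpha. The weights must be chosen without reference to the test statistics themselves. The function name is hypothetical.

```python
def weighted_bonferroni(p_values, weights, alpha: float = 0.05):
    """Return a list of booleans: which hypotheses are rejected.

    The per-test thresholds alpha * w_i / m sum to alpha once the weights
    are rescaled to have mean one.
    """
    m = len(p_values)
    if len(weights) != m:
        raise ValueError("need exactly one weight per p-value")
    total = sum(weights)
    if total <= 0 or any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative with a positive sum")
    scaled = [w * m / total for w in weights]  # mean weight becomes 1
    return [p <= alpha * w / m for p, w in zip(p_values, scaled)]

p_values = [1e-3, 0.02, 0.04]
print(weighted_bonferroni(p_values, [1, 1, 1]))      # ordinary Bonferroni
print(weighted_bonferroni(p_values, [0.5, 2, 0.5]))  # up-weight the second SNP
```

With equal weights this reduces to the ordinary Bonferroni correction; up-weighting a SNP raises its threshold at the expense of the others.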

A general approach to two-stage testing for fam- 
ily designs builds on the two information sources 
about association that are present in family-based 
designs. Using the notation introduced in Section 2 
for the distribution of X and Y, the joint distribu- 
tion for X, Y and P (or, equivalently, S) can be 
partitioned into two statistically independent com- 
ponents (Laird and Lange (2006)), 

(4) $f(X, Y, P \mid \theta, \Phi) = f(X \mid Y, P, \theta)\, f(Y, P \mid \Phi, \theta)$,

where $\Phi$ represents additional parameters required
to model the parental genotype distribution, for example,
genotype frequencies and possible non-random
mating. Note that both components, $f(X \mid Y, P, \theta)$
and $f(Y, P \mid \Phi, \theta)$, will have information about $\theta$, but
the information from $f(Y, P \mid \Phi, \theta)$ will depend on
the parental genotype distribution, and can be sensitive
to population substructure.

For the first step of the testing strategy, the screening
step, we use the information in $f(Y, P \mid \Phi, \theta)$ to
estimate the association parameters; the second, or
testing step, uses $f(X \mid Y, P, \theta)$. The likelihood decomposition
implies that the two steps of the testing
strategy are independent. The "evidence for association"
estimated from $f(Y, P \mid \Phi, \theta)$ can be utilized in
the testing stage, without having to adjust the test 
for the estimation of the genetic effect size in the 
first stage. Several methods have been suggested to 
exploit this relationship in developing testing strate- 
gies which use both forms of information in order 
to increase power, while retaining robustness of the 
test. 

Van Steen et al. (2005a) originally proposed a ver- 
sion of this two-step testing strategy for the analy- 
sis of quantitative traits. First, an effect size is es- 
timated for each SNP by regressing the offspring 
phenotype Y on $E(X \mid P)$; this effect size is used to
calculate the estimated power of the FBAT statistic 
for each SNP (Lange and Laird (2002)). Some num- 
ber of top ranking SNPs (10 or 20) were selected 
for testing with the FBAT statistic at the second 
stage. Because of the independence, both steps can 
be applied to the same data set without having to 
adjust the overall significance level for the multiple 
usage of the data. An extension by Ionita-Laza et al. 
(2007) proposed testing all SNPs at the second stage 
using weighted Bonferroni. Extensions of this test- 
ing strategy are available for using parental pheno- 
types and arbitrary structures at the screening stage 
(Feng, Zhang and Sha (2007)) and for case/control
designs (Zheng et al. (2007)). 
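
The two-step logic for a quantitative trait can be sketched as follows. This is a simplified illustration, not the published algorithm: SNPs are ranked by the strength of the regression of Y on E(X|P) rather than by the estimated conditional power, the trait offset is taken to be the sample mean, and the selected SNPs are tested with an asymptotic FBAT-type Z statistic at a Bonferroni level for the number of SNPs carried forward. All names and the data layout are hypothetical.

```python
import math
from statistics import NormalDist

# Each trio is (y, x, mother_a, father_a): offspring trait, offspring count of
# allele A, and the count of allele A carried by each parent (0, 1 or 2).

def mendel_moments(mother_a, father_a):
    """E(X|P) and var(X|P) under Mendelian transmission."""
    pm, pf = mother_a / 2, father_a / 2
    return pm + pf, pm * (1 - pm) + pf * (1 - pf)

def screening_score(trios):
    """Screening step: uses only Y and P, never the offspring genotype X."""
    ex = [mendel_moments(m, f)[0] for _, _, m, f in trios]
    ys = [y for y, _, _, _ in trios]
    ex_bar, y_bar = sum(ex) / len(ex), sum(ys) / len(ys)
    sxx = sum((e - ex_bar) ** 2 for e in ex)
    sxy = sum((e - ex_bar) * (y - y_bar) for e, y in zip(ex, ys))
    return abs(sxy) / math.sqrt(sxx) if sxx > 0 else 0.0

def fbat_p_value(trios):
    """Testing step: two-sided p-value of the normalized score statistic."""
    y_bar = sum(y for y, _, _, _ in trios) / len(trios)
    u = v = 0.0
    for y, x, m, f in trios:
        mean, var = mendel_moments(m, f)
        u += (y - y_bar) * (x - mean)
        v += (y - y_bar) ** 2 * var
    if v == 0:
        return 1.0  # no informative trios for this SNP
    z = u / math.sqrt(v)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def two_stage(data_by_snp, n_top=10, alpha=0.05):
    """Rank all SNPs by the screening score, then test only the top n_top."""
    ranked = sorted(data_by_snp,
                    key=lambda s: screening_score(data_by_snp[s]),
                    reverse=True)
    selected = ranked[:n_top]
    if not selected:
        return []
    threshold = alpha / len(selected)  # correct only for the SNPs tested
    return [s for s in selected if fbat_p_value(data_by_snp[s]) <= threshold]
```

The design choice to illustrate is that the multiple-testing correction in the second step involves only the number of SNPs carried forward, because the screening step does not use the information on which the test is based.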

The Van Steen approach has three key advan- 
tages: (1) The method achieves statistical power lev- 
els which can be substantially higher than those 
of standard family-based approaches and is thereby 
able to establish genome-wide significance with 
smaller/more realistic sample sizes (Van Steen et al. 
(2005b); Ionita-Laza et al. (2007); Feng, Zhang and 
Sha (2007); Zheng et al. (2007)). (2) The Van Steen 
algorithm maintains the separation between the mul- 
tiple testing problem and the replication process. 
Replication attempts in different studies are reserved 
for the generalization of the established associations 
and the assessment of heterogeneity between study 
populations. (3) Since genome-wide significance is 
established in the first data set, the number of SNPs 
that is pushed forward to true replication in other 
populations is generally very small and does not 
require a large budget, which makes simultaneous 
replication attempts in multiple samples feasible. 
Extensive simulation studies have shown that 2-stage 
testing strategies that utilize both sources of infor- 
mation about the association can help family-based 
studies to achieve power levels that are similar to 
those of population-based studies, while maintaining
the original advantage of family-based studies,
that is, complete robustness against confounding.

By looking at the distribution of parental mating 
types in ascertained samples f(P\Y,<&,6), Murphy 
et al. (2008) extended the general approach to the 
trio-designs in which all probands are affected (Y = 
1). Even here, the application of 2-stage Van Steen- 
testing strategies can lead to meaningful power im- 
provements over the standard TDT. Other possibil- 
ities for utilizing the information from the screen- 
ing step include specifying "tuning-parameters" in 
the FBAT-statistic (Lange et al. (2004); Jiang et al. 
(2006)) so that the power of the FBAT test is max- 
imized. 

5. DISCUSSION 

Family designs have historically been popular
because of their robustness to population substructure.
An additional, often unappreciated, feature of
family designs, which is important with measured or
time-to-onset outcomes, is their robustness to model
specification, and the ability to utilize the population
information to specify unknown parameters in
the model. With the availability of modern SNP
chips, and genotyping of thousands of subjects on
hundreds of thousands of markers, we now have the 
potential to identify the genetic backgrounds of in- 
dividuals, and utilize that information to control 
for confounding by population substructure in case- 
control studies (Roeder and Luca (2008)). An im- 
portant question is whether or not there is a need 
for family designs in the era of GWAS, given the 
potential to resolve difficulties with population sub-
structure in case-control designs. Additional stud-
ies and experience with actual studies are needed
to compare the performance of family designs and 
adjusted case-control designs in GWAS settings. 

Hampered by limitations in terms of power in many 
scenarios, and by the difficulty of recruitment, family- 
based designs certainly cannot be considered as the 
gold standard approach in genome-wide association 
studies. However, given the unique properties and 
features of a family design, they will continue to 
play a pivotal role in large scale association studies. 

In multi-stage genome-wide association studies,
family-based studies should be utilized as one of the
stages as early as the budget permits their
implementation. Their complete robustness against both
genetic confounding and misspecification of the
phenotypic model provides them with an important role
in the process of replicating and validating findings 
of the discovery step. Given the unavoidable genetic 
and phenotypic heterogeneity in large-scale multi- 
stage genome-wide association studies, this feature 
of family-based association tests is crucial and should 
not be ignored. If the budget permits the additional 
genotyping cost, family-studies can be a favorable 
choice for the first stage of a genome-wide associa- 
tion study. There, family-based studies can be de- 
signed so that they have equivalent power to 
population-based studies and, at the same time, of- 
fer a unique combination of additional analysis fea- 
tures and robustness properties. 

While the analysis features of family-based de- 
signs make them an attractive choice in the design 
phase of genome-wide studies, their ability to
assess the magnitude of the hidden genotyping error
should always be utilized, even with case/control de- 
signs. By genotyping a small number of families on 
the same platform with the case/control samples, 
researchers can examine the genotyping quality of 
the data after the QC process and assess the true 
power of the study. 
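
One simple way to use such families for this purpose is to count Mendelian inconsistencies in parent-offspring trios. The sketch below is an illustration only, assuming biallelic SNP genotypes coded as minor-allele counts (0, 1 or 2); the function names and the input layout are hypothetical. Since many genotyping errors do not produce a Mendelian inconsistency, the resulting rate should be read as a lower bound on the error rate.

```python
def transmittable(parent_genotype):
    """Minor-allele counts (0 or 1) that a parent with this genotype can transmit."""
    return {0: {0}, 1: {0, 1}, 2: {1}}[parent_genotype]

def is_mendelian_consistent(father, mother, child):
    """True if the child's genotype can arise from one allele of each parent."""
    return any(a + b == child
               for a in transmittable(father)
               for b in transmittable(mother))

def mendelian_inconsistency_rate(trios):
    """Fraction of fully genotyped trios that violate Mendelian transmission.

    trios is a hypothetical iterable of (father, mother, child) genotypes at
    one SNP; trios with a missing genotype (None) are skipped.
    """
    checked = errors = 0
    for father, mother, child in trios:
        if None in (father, mother, child):
            continue
        checked += 1
        if not is_mendelian_consistent(father, mother, child):
            errors += 1
    return errors / checked if checked else float("nan")
```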

APPENDIX

Here we do some simple calculations which illustrate
the power differences between case-control and
trio designs. The basic idea is to calculate the
expected value of the corresponding Z statistics under
the alternative. To make the calculations simple,
we use a relative risk model, and we assume
that the allele frequency, the relative risk and the
prevalence are small. We use the following notation:
$p$ = disease allele frequency, $\rho$ = relative risk,
$K$ = prevalence, $r = P(Y = 1 \mid X = 0)$, where
$Y = 1$ indicates disease, and $X = 1$ indicates the
recessive genotype. Assuming that Hardy-Weinberg
equilibrium holds in the population, $P(X = 1) = p^2$ and
$K = \rho r p^2 + r(1 - p^2) \Rightarrow r = K/(\rho p^2 + (1 - p^2))$.

For the case-control design, we compute

(5) $p_{\mathrm{cases}} = P(X = 1 \mid Y = 1) = r \rho p^2 / K$ and
    $p_{\mathrm{controls}} = P(X = 1 \mid Y = 0) = (1 - r\rho) p^2 / (1 - K)$,

and letting $\bar{p} = (p_{\mathrm{cases}} + p_{\mathrm{controls}})/2$, we have that the
expected Z is approximately

(6) $E(Z) = \sqrt{N}\,(p_{\mathrm{cases}} - p_{\mathrm{controls}}) / \sqrt{2\bar{p}(1 - \bar{p})}$,

where N is the number in each group. For $N = 1500$,
$p = 0.1$, $K = 0.01$ and relative risk $\rho = 1.75$, this gives
$p_{\mathrm{cases}} \approx 0.0174$, $p_{\mathrm{controls}} \approx 0.0099$ and $E(Z) \approx 1.75$,
which corresponds to essentially zero power if
$\alpha = 0.00001$.
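
The arithmetic of the case-control calculation can be reproduced with the short sketch below. The function names are hypothetical, the second symbol listed in the notation (the relative risk) is written as `rho`, and the pooled frequency is taken as the average of the case and control frequencies, which is the reading that reproduces the numbers quoted above. The power approximation assumes a two-sided normal test and ignores the far tail; the text does not state whether the significance level is one- or two-sided.

```python
from math import sqrt
from statistics import NormalDist

def case_control_expected_z(n_per_group, p, K, rho):
    """Expected Z for a recessive relative-risk model in a case-control design.

    p = disease allele frequency, K = prevalence, rho = relative risk,
    n_per_group = number of cases (and of controls).
    """
    r = K / (rho * p**2 + (1 - p**2))            # baseline risk P(Y=1 | X=0)
    p_cases = r * rho * p**2 / K                  # P(X=1 | Y=1)
    p_controls = (1 - r * rho) * p**2 / (1 - K)   # P(X=1 | Y=0)
    p_bar = (p_cases + p_controls) / 2            # pooled frequency
    z = sqrt(n_per_group) * (p_cases - p_controls) / sqrt(2 * p_bar * (1 - p_bar))
    return p_cases, p_controls, z

def approximate_power(expected_z, alpha, two_sided=True):
    """Normal approximation: P(Z > critical value) with Z ~ N(expected_z, 1)."""
    tail = alpha / 2 if two_sided else alpha
    critical = NormalDist().inv_cdf(1 - tail)
    return 1 - NormalDist().cdf(critical - expected_z)

p_cases, p_controls, z = case_control_expected_z(1500, p=0.1, K=0.01, rho=1.75)
print(round(p_cases, 4), round(p_controls, 4), round(z, 2))
print(approximate_power(z, alpha=0.00001))
```

Under these assumptions the approximate power at the stated significance level is well below 1%, in line with the statement that the design has essentially no power in this setting.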

For the trio design, we consider the 2 informative
mating types, that is, 2 heterozygous parents
(Type 1) and one heterozygous parent and one rare
homozygous parent (Type 2). Under the alternative
hypothesis, the expected number of families for each
mating type can be calculated by

Type 1: $r p^2 (1 - p)^2 (\rho + 3) N / K$,
Type 2: $2 r p^3 (1 - p)(\rho + 1) N / K$.

Next, we compute the Mendelian residuals which 
are defined as the expected marker score under the 
alternative hypothesis minus the expected marker 
score under the null-hypothesis for both mating types: 

Type 1: $3(\rho - 1) / (4(\rho + 3))$,
Type 2: $(\rho - 1) / (2(\rho + 1))$.



The variances for the two mating types, used in the
denominator of the FBAT statistic, are 3/16
and 1/4, respectively.

Then the expected FBAT-statistic for a recessive
model under the alternative hypothesis is given by

(7) $E(Z) = \dfrac{2p(\rho - 1)\sqrt{N (r/K)(1 - p)}\,(3 + p)}{\sqrt{\rho(p + 3) - 5p + 9}}$.

For the parameters given above, this equals Z =
4.56, which results in the observed power levels of
the plot for $K = 0.01$.
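
As a numerical check, the sketch below evaluates the expected numbers of informative families and the closed-form expected FBAT statistic stated above. The function names are hypothetical, the relative risk is written as `rho`, and the grouping of terms under the square roots is an assumption of this sketch: it is the reading of the printed expression that reproduces the quoted value of 4.56 for N = 1500, p = 0.1, K = 0.01 and a relative risk of 1.75.

```python
from math import sqrt

def baseline_risk(p, K, rho):
    """r = P(Y=1 | X=0) under Hardy-Weinberg equilibrium and a recessive model."""
    return K / (rho * p**2 + (1 - p**2))

def expected_informative_families(n_trios, p, K, rho):
    """Expected counts of the two informative mating types among affected trios."""
    r = baseline_risk(p, K, rho)
    # Type 1: both parents heterozygous.
    type1 = r * p**2 * (1 - p)**2 * (rho + 3) * n_trios / K
    # Type 2: one heterozygous parent and one rare-homozygous parent.
    type2 = 2 * r * p**3 * (1 - p) * (rho + 1) * n_trios / K
    return type1, type2

def trio_expected_z(n_trios, p, K, rho):
    """Closed-form expected FBAT statistic, evaluated as printed in the text."""
    r = baseline_risk(p, K, rho)
    numerator = 2 * p * (rho - 1) * sqrt(n_trios * (r / K) * (1 - p)) * (3 + p)
    denominator = sqrt(rho * (p + 3) - 5 * p + 9)
    return numerator / denominator

print(expected_informative_families(1500, p=0.1, K=0.01, rho=1.75))
print(round(trio_expected_z(1500, p=0.1, K=0.01, rho=1.75), 2))
```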